METHOD AND APPARATUS FOR PERFORMING MULTIPLE DESCRIPTION
MOTION COMPENSATION USING HYBRID PREDICTIVE CODES



The present invention relates generally to multiple description coding (MDC) of
data, speech, audio, images, video and other types of signals for transmission over a
network or other type of communication medium.

A large fraction of the information that flows across today's networks is useful
even in a degraded condition. Examples include speech, audio, still images and video. 
When this information is subject to packet losses, retransmission may be impossible due to 
real-time constraints. Superior performance with respect to total transmitted rate, 
distortion, and delay may sometimes be achieved by adding redundancy to the bit stream 
rather than repeating lost packets. 

One way to add redundancy to a bit stream is multiple description
coding (MDC), wherein the data is broken into several streams with some redundancy
among the streams. When all the streams are received, one can guarantee low distortion at 
the expense of having a slightly higher bit rate than a system designed purely for 
compression. On the other hand, when only some of the streams are received, the quality of 
the reconstruction degrades gracefully, which is very unlikely to happen with a system 
designed purely for compression. Unlike multi resolution or layered source coding, there 
is no hierarchy of descriptions; thus multiple description coding is suitable for erasure 
channels or packet networks without priority provisions.

Multiple description coding can be implemented in a number of ways. One way is 
by splitting an incoming video stream into an arbitrary subset of channels by collecting the 
odd and even frame sequence separately at the encoder and coding the resultant temporally 
sub-sampled sequences independently. Upon receiving one of the sub-sampled sequences 
at the decoder, the video stream can be decoded at half the frame rate. Due to the
correlated nature of the video stream, receiving only one of the sub-sampled sequences
allows for the recovery of intermediate frames using motion compensated error
concealment techniques. This technique is described in greater detail in Wenger et al.,
"Error resilience support in H.263+," IEEE Transactions on Circuits and Systems for
Video Technology, pp. 867-877, November 1998. 
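The odd/even splitting described above can be sketched as follows. This is a minimal illustration only: frames are modelled as single numbers, the function names are hypothetical, and simple neighbour averaging stands in for the motion compensated error concealment that a real decoder would use.

```python
from typing import List, Sequence, Tuple


def split_descriptions(frames: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split a frame sequence into odd and even sub-sequences.

    Frames are numbered from 1, so frames[0] is frame 1 (odd).
    """
    odd = list(frames[0::2])
    even = list(frames[1::2])
    return odd, even


def conceal_from_one(received: Sequence[float]) -> List[float]:
    """Rebuild a full-rate sequence from a single received sub-sequence.

    Each missing frame is estimated as the mean of its two received
    neighbours (an assumed stand-in for motion compensated concealment).
    The last missing frame repeats the final received frame.
    """
    out: List[float] = []
    for i, frame in enumerate(received):
        out.append(frame)
        nxt = received[i + 1] if i + 1 < len(received) else frame
        out.append((frame + nxt) / 2)
    return out
```

Decoding only the list returned as `odd` yields the video at half the frame rate; passing it through `conceal_from_one` gives an approximation at the full rate.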

To achieve error resilience, Wang and Lin, "Error resilient video coding using 
multiple description motion compensation," IEEE Trans. Circuits and Systems for Video 
Technology, vol. 12, no. 6, pp. 438-52, June 2002, describe one method for implementing
multiple description coding. In accordance with this approach, temporal predictors allow
the encoder to use both the past even and odd frames while encoding, thus creating a
mismatch between the encoder and the decoder when only one description is received at
the decoder. The mismatch error is explicitly encoded to overcome this problem. The
main benefit of allowing the encoder to use both odd and even frame sequences for
prediction is in terms of coding efficiency. By changing the temporal filter taps, the 
amount of redundancy can be controlled. The method disclosed provides reasonable 
flexibility between the amount of redundancy and the error resilience. 
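To illustrate how a filter tap can trade redundancy against coding efficiency, the sketch below assumes a two-tap linear predictor in which the encoder weights the previous frame by a1 and the frame before it by (1 - a1), while a decoder holding a single description can only use the frame two steps back. The tap name a1, the scalar frames and the omission of motion compensation are assumptions made for illustration; this is not presented as the predictor specified by Wang and Lin.

```python
def central_prediction(prev1: float, prev2: float, a1: float) -> float:
    """Encoder-side prediction using both previous frames (assumed two-tap form)."""
    return a1 * prev1 + (1.0 - a1) * prev2


def side_prediction(prev2: float) -> float:
    """Prediction available when only one description arrives (frame n-2 only)."""
    return prev2


def mismatch(prev1: float, prev2: float, a1: float) -> float:
    """Difference between encoder and single-description predictions.

    This is the quantity that must be coded explicitly so that a decoder
    holding one description can stay in step with the encoder.
    """
    return central_prediction(prev1, prev2, a1) - side_prediction(prev2)
```

Under these assumptions, a1 = 0 makes the two predictions identical, so there is no mismatch to send but the encoder ignores the nearest frame. Larger a1 uses the nearest frame more heavily and increases the mismatch signal.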

A drawback of the approach of Wang and Lin is that it is limited to only I and P 
frames (no B-frames). A further drawback of the approach is that it does not allow for
multi-frame prediction like that employed in H.26L. These drawbacks limit the coding
efficiency of MDMC and also require full proprietary implementations instead of using 
available codec modules. 

The invention provides an improved multiple description coding (MDC) method 
and apparatus which overcomes the drawbacks described above. Specifically, the coding 
method of the invention extends multi-description motion compensation (MDMC) by 
allowing for multi-frame prediction and is not limited to only I and P frames. Further, the 
coding method of the invention extends MDMC for use with any conventional predictive 
codec, such as, for example, MPEG2/4 and H.26L.

According to a first aspect of the invention, there is provided an improved MDMC 
encoder including three predictive coders, i.e., a top, middle and bottom coder. Input 
frames are supplied to the encoder as three separate inputs. The input frames are supplied 
to a central encoder. In addition, the input frames are divided or split into two sub-streams 
of frames, a first sub-stream comprising only the odd frames and a second sub-stream
comprising only the even frames. The first sub-stream comprised of odd frames is
provided as input to be encoded by the top encoder to yield an encoded odd frame
sequence and the second sub-stream comprised of even frames is provided as input to be
encoded by the bottom encoder to yield an encoded even frame sequence. It is noted that
other embodiments may divide the frames using different criteria such as, for example, an
unbalanced division where two of every three frames are encoded by the top encoder and
every third frame is encoded by the bottom encoder. The original undivided input stream of
frames is applied to the central encoder which computes the prediction of the odd frames
from the even frames. Additionally, the central encoder separately computes the prediction
of the even frames from the odd frames. Prediction residuals are then computed between
the central encoder and the first and second side encoders, respectively. The MDMC
encoder of the invention outputs the first computed prediction residual, corresponding to 
the prediction of the even frames, along with the output of the top encoder and outputs the 
second computed prediction residual, corresponding to the prediction of the odd frames, 
along with the output of the bottom encoder. 

According to a second aspect of the invention there is provided a method of 
encoding a video signal representing a sequence of frames, the method comprising splitting 
the sequence of frames into a first sub-sequence and a second sub-sequence, applying the
first sub-sequence to a first side encoder, applying the second sub-sequence to a second
side encoder, applying the original unsplit sequence of frames to a central encoder,
computing a first prediction residual between the output of the first side encoder and the 
central encoder, computing a second prediction residual between the output of the second 
side encoder and the central encoder, combining the first prediction residual and the output 
of the first side encoder as a first data sub-stream, combining the second prediction residual 
and the output of the second side encoder as a second data sub-stream, and separately
transmitting the first and second data sub-streams.
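The sequence of steps in this second aspect can be sketched as a small orchestration function. The encoders and the residual computation are passed in as callables because the text allows any conventional predictive codec. The names `DataSubStream` and `encode_two_descriptions`, and the odd/even split used for the two sub-sequences, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple


@dataclass
class DataSubStream:
    side_output: Any          # output of one side encoder
    prediction_residual: Any  # residual computed against the central encoder


def encode_two_descriptions(
    frames: Sequence[Any],
    side_encoder_1: Callable[[List[Any]], Any],
    side_encoder_2: Callable[[List[Any]], Any],
    central_encoder: Callable[[List[Any]], Any],
    residual: Callable[[Any, Any], Any],
) -> Tuple[DataSubStream, DataSubStream]:
    # Split the sequence into two sub-sequences (odd/even split assumed here).
    first_sub = list(frames[0::2])
    second_sub = list(frames[1::2])

    # Apply each sub-sequence to its own side encoder.
    side_out_1 = side_encoder_1(first_sub)
    side_out_2 = side_encoder_2(second_sub)

    # Apply the original unsplit sequence to the central encoder.
    central_out = central_encoder(list(frames))

    # Combine each side output with its prediction residual.
    stream_1 = DataSubStream(side_out_1, residual(side_out_1, central_out))
    stream_2 = DataSubStream(side_out_2, residual(side_out_2, central_out))

    # The caller transmits the two sub-streams separately.
    return stream_1, stream_2
```

Keeping the codecs behind plain callables mirrors the point that the side encoders need not be purpose-built; any function with the right shape can be substituted.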

Advantages of the invention include: 

(1) Any conventional predictive coder may be used for the top and bottom encoders. 
Further, the top and bottom predictive coders can advantageously include B-frames and
multiple prediction motion compensation.

(2) Any of the top, middle and bottom predictive encoders can be a scalable
encoder (e.g., FGS-like or data-partitioning-like, where the motion vectors (MVs) are sent
first, temporal scalability, etc.). For example, in the case where only the middle encoder is
a scalable encoder, the middle encoder will send only as much information as the channel
allows. In an extreme case when it is determined that the available bandwidth is very low,
only the information encoded by the side-coders will be transmitted. As additional
bandwidth becomes available, then as much of the mismatch signal as the channel allows
will be transmitted using the scalable middle encoder.

(3) To limit the complexity of the system, the prediction from odd/even frame
sequence of the current even/odd frame for determining the mismatch signal can be made
from B-frames.

(4) Instead of computing and coding the side prediction errors (i.e., the errors
between the even-frames and odd-frames for the side coders) as is conventional and also
the mismatch between the side prediction error and the central error (i.e., the error between
the current-frame and the prediction from the previous two frames), alternatively, the
central error is computed.
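The bandwidth behaviour described in advantage (2) can be sketched as a simple budget split. This assumes the side-coder output is sent first and the scalable middle encoder's mismatch signal can be truncated at any bit. The function name and the bit-count model are hypothetical.

```python
from typing import Tuple


def plan_transmission(side_bits: int, mismatch_bits: int, channel_bits: int) -> Tuple[int, int]:
    """Return (side bits sent, mismatch bits sent) for one transmission interval.

    Side-coder information takes priority; whatever budget remains is
    filled with as much of the mismatch signal as the channel allows.
    """
    side_sent = min(side_bits, channel_bits)
    remaining = max(channel_bits - side_sent, 0)
    mismatch_sent = min(mismatch_bits, remaining)
    return side_sent, mismatch_sent
```

With a budget that only just covers the side-coder output, no mismatch bits are sent. As the budget grows, the mismatch share grows until the whole signal fits.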

Referring now to the drawings where like reference numbers represent 
corresponding parts throughout: 

FIG. 1 illustrates an MDMC encoder according to one embodiment of the invention. 

Multiple Description Coding (MDC) refers to one form of compression where the 
goal is to code an incoming signal into a number of separate bit-streams, where the 
multiple bit-streams are often referred to as multiple descriptions. These separate
bit-streams have the property that they are all independently decodable from one another.
Specifically, if a decoder receives any single bit-stream it can decode that bit-stream to
produce a useful signal (without requiring access to any of the other bit-streams). MDC has
the additional property that the quality of the decoded signal improves as more bit-streams
are accurately received. For example, assume that a video is coded with MDC into a total
of N streams. As long as a decoder receives any one of these N streams it can decode a
useful version of the video. If the decoder receives two streams it can decode an improved
version of the video as compared to the case of only receiving one of the streams. This
improvement in quality continues until the receiver receives all N of the streams, in which
case it can reconstruct the maximum quality. 

There are a number of different approaches to achieve MDC coding of video. One 
approach is to independently code different frames into different streams. For example, 
each frame of a video sequence may be coded as a single frame (independently of the other 
frames) using only intra frame coding, e.g. JPEG, JPEG-2000, or any of the video coding 
standards (e.g. MPEG-1/2/4, H.261/3) using only I-frame encoding. Then different frames
can be sent in the different streams. For example, all the even frames may be sent
in stream 1 and all the odd frames may be sent in stream 2. Because each of the frames is
independently decodable from the other frames, each of the bit-streams is also
independently decodable from the other bit-stream. This simple form of MDC video
coding has the properties described above, but it is not very efficient in terms of
compression because of the lack of inter-frame coding.
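The intra-only scheme above, generalised to N streams, can be sketched as a round-robin assignment of independently coded frames. The round-robin rule for N greater than 2 and the function names are assumptions for illustration; the text itself only describes the two-stream case.

```python
from typing import List, Sequence, Set


def make_descriptions(frames: Sequence[str], n: int) -> List[List[str]]:
    """Distribute independently coded frames over n streams, round-robin."""
    return [list(frames[k::n]) for k in range(n)]


def frames_available(num_frames: int, n: int, received: Set[int]) -> int:
    """Count the frames a decoder can reconstruct from the received streams.

    Because every frame is intra coded, a frame is decodable exactly when
    the stream carrying it (index i % n) has been received.
    """
    return sum(1 for i in range(num_frames) if i % n in received)
```

For ten frames in three streams, the count rises from 4 to 7 to 10 as one, two and three streams arrive, which is the graceful improvement in quality that the text describes.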

Before describing Fig. 1 in detail, we recall some definitions concerning the
hierarchical arrangement of the pixels within a digitized picture and the prediction strategy
as used in the MPEG2 standard. Both luminance and chrominance samples (pixels) are
grouped into blocks each made of an 8x8 matrix (8 rows of 8 pixels each); a certain
number of luminance and chrominance blocks (e.g., 4 blocks of luminance data and 2
corresponding blocks of chrominance data) form a macro-block; the digitized picture then
comprises a matrix of macro-blocks of which the size depends on the profile (i.e., on the
resolution) chosen and on the power supply frequency: for instance, in case of 50 Hz power
supply, the size can range from a minimum of 18x32 macro-blocks to a maximum of
72x120. Pictures can in turn have a frame structure (in which pixels of subsequent
rows pertain to different fields) or a field structure (in which all pixels pertain to the same
field). As a consequence, macro-blocks may have a frame or field structure, as well.
Pictures are in turn organized into groups of pictures, in which the first picture is always an
I picture, which is followed by a number of B pictures (bi-directionally interpolated
pictures, which have been submitted to forward or backward prediction or to both,
"forward" meaning that prediction is based on a previous reference picture and "backward"
meaning that prediction is based on a future reference picture) and then by a P picture
which, being used for prediction of the B pictures, is to be encoded immediately after the I
picture.
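The block and macro-block arithmetic above can be checked with a short calculation. The sketch assumes the four luminance blocks are arranged 2x2, so that a macro-block covers 16x16 luminance pixels, and that a macro-block matrix is given as rows by columns. The constant and function names are hypothetical.

```python
from typing import Tuple

BLOCK_SIZE = 8            # each block is 8 rows of 8 pixels
LUMA_BLOCKS_PER_SIDE = 2  # assumed 2x2 arrangement of the 4 luminance blocks
LUMA_BLOCKS = 4
CHROMA_BLOCKS = 2


def luma_dimensions(mb_rows: int, mb_cols: int) -> Tuple[int, int]:
    """Return (lines, pixels per line) of luminance for a macro-block matrix."""
    mb_side = BLOCK_SIZE * LUMA_BLOCKS_PER_SIDE
    return mb_rows * mb_side, mb_cols * mb_side


def samples_per_macroblock() -> int:
    """Total luminance plus chrominance samples carried by one macro-block."""
    return (LUMA_BLOCKS + CHROMA_BLOCKS) * BLOCK_SIZE * BLOCK_SIZE
```

Under these assumptions the 72x120 maximum corresponds to 1152 lines of 1920 luminance pixels, and each macro-block carries 384 samples.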

Referring now to Fig. 1, a source, not shown, supplies the encoder 200 with a
sequence of frames 201 (i.e., a frame structure) already arranged in the coding order, i.e.,
an order making the reference pictures available before the pictures utilizing them for
prediction. The full frame sequence 201 is received by a motion estimation unit (not shown)
which is to compute and emit one or more motion vectors for each macro-block in a picture
being coded, and a cost or error associated with each vector. The encoder 200
includes a first side encoder (side encoder 1) 202, a central encoder 204 and a second side
encoder 206. The full frame sequence 201 is supplied in its entirety to the central encoder
204. A first subset 210 of the full frame sequence 201, which in the present embodiment
constitutes the odd frame sub-sequence 210 of the full frame sequence 201, is applied
to the first side encoder 202. A second subset 220 of the full frame sequence 201, which in
the present embodiment constitutes the even frame sub-sequence 220 of the full frame sequence
201, is applied to the second side encoder 206.

The prediction encoding operation will now be summarized.

A. First Side Encoder 202

Odd frame sub-sequence 210, which comprises a subset of input sequence 201, is
applied to the first side encoder 202. It should be noted that the first side encoder 202 may
be advantageously embodied as any conventional predictive codec (e.g., MPEG-1/2/4,
H.261/3). The odd frame sub-sequence 210 is encoded by the first side encoder 202
which outputs encoded odd frame sub-sequence 211. Encoded odd frame sub-sequence
211 is included as one component to be output in the first data sub-stream 245. The
encoded odd frame sub-sequence 211 is also supplied as an input to central encoder
sub-module 230, to be described below.

B. Second Side Encoder 206

Even frame sub-sequence 220, which comprises a subset of input sequence 201, is
applied to the second side encoder 206. It should be noted that the second side encoder
206, similar to the first side encoder 202, may also be advantageously embodied as any
conventional predictive codec (e.g., MPEG-1/2/4, H.261/3). The even frame
sub-sequence 220 is encoded by the second side encoder 206 which outputs encoded even
frame sub-sequence 212. The encoded even frame sub-sequence 212 is included as one
component to be output in the second data sub-stream 255. The encoded even frame
sub-sequence 212 is also supplied as an input to central encoder sub-module 232, to be
described below.

C. Central Encoder 204

Full frame sequence 201 is applied to the central encoder 204. 

Central encoder sub-module 250 computes a first set of motion vectors 214 and 
also computes and encodes the even frame prediction sequence 215, which constitutes the
prediction of even frames from the odd frames of input sequence 201. The central encoder 
sub-module 250 outputs the even frame prediction sequence 215 and the first motion 
vector sequence 214, both of which are supplied as input to central encoder sub-module 
230. 

Central encoder sub-module 260 computes a second set of motion vectors 216 and
also computes and encodes the odd frame prediction sequence 217, which constitutes the
prediction of odd frames from the even frames of input sequence 201. The central encoder
sub-module 260 outputs the odd frame prediction sequence 217 and the second motion
vector sequence 216, both of which are supplied as input to central encoder sub-module
232.



Central encoder sub-module 230 performs two functions or processes. A first
process is directed to encoding the first set of motion vectors 214 received from
sub-module 250 to output a first set of encoded motion vectors 218. The second function or
process is directed to computing a first prediction residual 221, which may be computed as:

First prediction residual = e_c - e_s     (1)

where e_c = even frame prediction sequence 215, and
      e_s = encoded odd frame sub-sequence 211.

The central encoder sub-module 230 output includes the encoded first prediction 
residual 221 along with the first set of coded motion vectors 218. These outputs are
combined with the encoded odd frame sequence 211 (Point A) and collectively output as
the first data sub-stream 245. 

Similarly, the second prediction residual is computed for inclusion in the second 
data sub-stream 255 as follows: 

Second prediction residual = e_c - e_s     (2)

where e_c = odd frame prediction sequence 217, and
      e_s = encoded even frame sub-sequence 212.

The central encoder sub-module 232 output includes the encoded second prediction 
residual 222 along with the second set of coded motion vectors 219. These outputs are 
combined with the encoded even frame sequence 212 (Point B) and output as the second
data sub-stream 255. 
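Equations (1) and (2) share the same form, a central-encoder term minus a side-encoder term. A minimal sketch of that subtraction, applied picture by picture and pixel by pixel, is given below. Representing each picture as a list of rows of integers, and pairing the two sequences element by element, are assumptions for illustration. The text does not state the domain (pixel or coded) in which the difference is taken.

```python
from typing import List

Picture = List[List[int]]  # rows of pixel values (assumed representation)


def picture_difference(central: Picture, side: Picture) -> Picture:
    """Pixel-wise difference between a central term and a side term."""
    return [[c - s for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(central, side)]


def prediction_residual(e_c: List[Picture], e_s: List[Picture]) -> List[Picture]:
    """Apply residual = e_c - e_s to two equally long picture sequences."""
    if len(e_c) != len(e_s):
        raise ValueError("e_c and e_s must contain the same number of pictures")
    return [picture_difference(c, s) for c, s in zip(e_c, e_s)]
```

The same function serves both equations: for (1), e_c is sequence 215 and e_s is sub-sequence 211; for (2), e_c is sequence 217 and e_s is sub-sequence 212.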

The foregoing description of the preferred embodiments of the invention has been 
presented for purposes of illustration and description. They are not intended to be 
exhaustive or to limit the invention to the precise form disclosed, and obviously many 
modifications and variations are possible in light of the above teachings. Such 
modifications and variations that are apparent to a person skilled in the art are intended to 
be included within the scope of this invention as defined by the accompanying claims.